ManimML: Communicating Machine Learning Architectures with Animation
There has been an explosion in interest in machine learning (ML) in recent
years due to its applications to science and engineering. However, as ML
techniques have advanced, tools for explaining and visualizing novel ML
algorithms have lagged behind. Animation has been shown to be a powerful tool
for making engaging visualizations of systems that dynamically change over
time, which makes it well suited to the task of communicating ML algorithms.
However, the current approach to animating ML algorithms is to handcraft
applications that highlight specific algorithms or use complex generalized
animation software. We developed ManimML, an open-source Python library for
easily generating animations of ML algorithms directly from code. We sought to
leverage ML practitioners' preexisting knowledge of programming rather than
requiring them to learn complex animation software. ManimML has a familiar
syntax for specifying neural networks that mimics popular deep learning
frameworks like PyTorch. A user can take a preexisting neural network
architecture and easily write a specification for an animation in ManimML,
which will then automatically compose animations for different components of
the system into a final animation of the entire neural network. ManimML is open
source and available at https://github.com/helblazer811/ManimML.
MalNet: A Large-Scale Cybersecurity Image Database of Malicious Software
Computer vision is playing an increasingly important role in automated
malware detection due to the rise of the image-based binary representation.
These binary images are fast to generate, require no feature engineering, and
are resilient to popular obfuscation methods. Significant research has been
conducted in this area; however, it has been restricted to small-scale or
private datasets that only a few industry labs and research teams have access
to. This lack of availability hinders examination of existing work, development
of new research, and dissemination of ideas. We introduce MalNet, the largest
publicly available cybersecurity image database, offering 133x more images and
27x more classes than the only other public binary-image database. MalNet
contains over 1.2 million images across a hierarchy of 47 types and 696
families. We provide extensive analysis of MalNet, discussing its properties
and provenance. The scale and diversity of MalNet unlock new and exciting
cybersecurity opportunities for the computer vision community, enabling
discoveries and research directions that were previously not possible. The
database is publicly available at www.mal-net.org.
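The image-based binary representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest common scheme, in which each byte of a file becomes one grayscale pixel and rows wrap at a fixed width. The function name `bytes_to_grayscale` and the `width` parameter are illustrative assumptions and are not taken from MalNet, whose actual pipeline may differ.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte to one pixel intensity (0-255), wrapping rows at `width`.

    Illustrative only: this is a generic byte-to-pixel scheme, not
    MalNet's own conversion code.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad the final row so every row has exactly `width` pixels.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Ten bytes wrapped at width 4 give three rows, the last one zero-padded.
image = bytes_to_grayscale(bytes(range(10)), width=4)
```

No feature engineering is involved: the pixel grid is a direct reshaping of the raw bytes, which is why such images are fast to generate.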
TopicViz: Semantic Navigation of Document Collections
When people explore and manage information, they think in terms of topics and
themes. However, the software that supports information exploration sees text
at only the surface level. In this paper we show how topic modeling -- a
technique for identifying latent themes across large collections of documents
-- can support semantic exploration. We present TopicViz, an interactive
environment for information exploration. TopicViz combines traditional search
and citation-graph functionality with a range of novel interactive
visualizations, centered around a force-directed layout that links documents to
the latent themes discovered by the topic model. We describe several use
scenarios in which TopicViz supports rapid sensemaking on large document
collections.
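A force-directed layout that links documents to topics can be sketched as follows. This is a hedged illustration of the general technique, not TopicViz's implementation: it assumes topics are pinned at fixed 2D positions, each document is pulled toward every topic by a spring weighted by its topic proportion, and documents repel one another weakly. The name `force_directed_layout` and all parameters are hypothetical.

```python
import math


def force_directed_layout(topic_pos, doc_weights, steps=200,
                          step_size=0.1, repulsion=0.001):
    """Place documents in 2D given pinned topics and per-document topic weights."""
    # Start documents on a tiny circle so no two share the same position.
    docs = [(0.01 * math.cos(i), 0.01 * math.sin(i))
            for i in range(len(doc_weights))]
    for _ in range(steps):
        updated = []
        for i, (x, y) in enumerate(docs):
            fx = fy = 0.0
            # Attraction: springs toward each topic, scaled by topic weight.
            for (tx, ty), w in zip(topic_pos, doc_weights[i]):
                fx += w * (tx - x)
                fy += w * (ty - y)
            # Repulsion: push away from every other document.
            for j, (ox, oy) in enumerate(docs):
                if j != i:
                    dx, dy = x - ox, y - oy
                    d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
                    fx += repulsion * dx / d2
                    fy += repulsion * dy / d2
            updated.append((x + step_size * fx, y + step_size * fy))
        docs = updated  # synchronous update from the previous positions
    return docs


topics = [(-1.0, 0.0), (1.0, 0.0)]
weights = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]  # rows sum to 1
positions = force_directed_layout(topics, weights)
```

In this sketch a document dominated by one topic settles near that topic, while a document split evenly between two topics settles between them, which is the property that lets spatial position stand in for thematic content.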